Mega-Reward: Achieving Human-Level Play without Extrinsic Rewards
Intrinsic rewards were introduced to simulate how human intelligence works;
they are usually evaluated by intrinsically-motivated play, i.e., playing games
without extrinsic rewards while still being evaluated with them. However, no
existing intrinsic reward approach achieves human-level performance
under this very challenging setting of intrinsically-motivated play. In this
work, we propose a novel megalomania-driven intrinsic reward (called
mega-reward), which, to our knowledge, is the first approach that achieves
human-level performance in intrinsically-motivated play. Intuitively,
mega-reward comes from the observation that infants' intelligence develops when
they try to gain more control over entities in an environment; therefore,
mega-reward aims to maximize the control capabilities of agents over given
entities in a given environment. To formalize mega-reward, a relational
transition model is proposed to bridge the gap between direct and latent
control. Experimental studies show that mega-reward (i) greatly outperforms
all state-of-the-art intrinsic reward approaches, (ii) generally achieves the
same level of performance as Ex-PPO and professional human-level scores, and
(iii) also performs better when incorporated with extrinsic rewards.
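To make the control-maximization idea concrete, below is a minimal sketch of a control-based intrinsic reward in Python, assuming a learned forward model that predicts per-entity features; the function and argument names are illustrative assumptions, not the paper's actual relational transition model.

```python
import numpy as np

def control_intrinsic_reward(forward_model, state, actions):
    """Hypothetical control-based intrinsic reward (illustrative only).
    Assumes forward_model(state, action) -> predicted next-step entity
    features, shape (num_entities, feature_dim)."""
    preds = np.stack([forward_model(state, a) for a in actions])  # (A, N, D)
    # If switching actions changes an entity's predicted next state, the
    # agent exerts some (direct or latent) control over that entity.
    per_entity_control = preds.var(axis=0).mean(axis=-1)          # (N,)
    return float(per_entity_control.sum())
```

In the paper itself, the relational transition model additionally bridges direct control (entities the agent moves itself) and latent control (entities influenced indirectly).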
Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence
Learning agents that are capable not only of taking tests but also of
innovating is becoming a hot topic in AI. One of the most promising paths
towards this vision is multi-agent learning, where agents act as the
environment for each other, and improving each agent means proposing new
problems for others. However, existing evaluation platforms are either not
compatible with multi-agent settings or limited to a specific game. That is,
there is not yet a general evaluation platform for research on multi-agent
intelligence. To this end, we introduce Arena, a general evaluation platform
for multi-agent intelligence with 35 games of diverse logics and
representations. Furthermore, multi-agent intelligence is still at the stage
where many problems remain unexplored. Therefore, we provide a building toolkit
for researchers to easily invent and build novel multi-agent problems from the
provided game set based on a GUI-configurable social tree and five basic
multi-agent reward schemes. Finally, we provide Python implementations of five
state-of-the-art deep multi-agent reinforcement learning baselines. Along with
the baseline implementations, we release a set of 100 best agents/teams
trained with different training schemes for each game, serving as a basis for
evaluating agents by population performance. As such, the research community
can perform comparisons under a stable and uniform standard. All the
implementations and accompanying tutorials have been open-sourced for the
community at https://sites.google.com/view/arena-unity/.
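The abstract does not show Arena's Python API, but the evaluation it describes follows the standard multi-agent loop; the sketch below uses hypothetical class and method names (RandomPolicy, reset, step), not Arena's real interface.

```python
# Generic multi-agent evaluation loop; all environment method names here
# (reset, step) are assumptions, not Arena's documented API.
import random

class RandomPolicy:
    """Baseline policy that samples uniformly from a discrete action set."""
    def __init__(self, n_actions):
        self.n_actions = n_actions

    def act(self, observation):
        return random.randrange(self.n_actions)

def evaluate(env, policies, episodes=10):
    """Average per-agent episode return, the population-performance
    style of comparison such a platform is built around."""
    totals = [0.0] * len(policies)
    for _ in range(episodes):
        observations = env.reset()          # one observation per agent
        done = False
        while not done:
            actions = [p.act(o) for p, o in zip(policies, observations)]
            observations, rewards, done, _ = env.step(actions)
            for i, r in enumerate(rewards): # per-agent reward vector
                totals[i] += r
    return [t / episodes for t in totals]
```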
Packet dropping characteristics in a queue with autocorrelated arrivals
This paper provides a detailed description of the packet dropping process caused by buffer overflows in a network node. In particular, we derive formulas for the most important loss characteristics, in both the transient and stationary regimes, and illustrate them with numerical examples. To obtain the dropping characteristics for strongly autocorrelated arrivals, the Markov-modulated Poisson process is used as the traffic model.
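The paper derives analytic formulas rather than simulating; still, a Monte Carlo sketch can make the setup concrete. The following Python snippet estimates the stationary drop probability of a finite-buffer queue fed by a two-state MMPP via competing exponential clocks; all parameter values are illustrative, not taken from the paper.

```python
import random

def simulate_mmpp_queue(lam=(0.5, 4.0), phase_rates=(0.02, 0.05),
                        mu=2.0, buffer_size=10, horizon=1e5, seed=1):
    """Estimate the stationary packet-drop probability of a finite-buffer
    queue fed by a 2-state Markov-modulated Poisson process (MMPP).
    lam[i]        : Poisson arrival rate while the modulating chain is in state i
    phase_rates[i]: rate of leaving state i (two states, so the destination
                    is always the other state)
    mu            : exponential service rate; buffer_size : total capacity."""
    rng = random.Random(seed)
    t, phase, queue = 0.0, 0, 0
    arrived = dropped = 0
    while t < horizon:
        rates = [lam[phase], phase_rates[phase]]
        if queue > 0:
            rates.append(mu)                 # service only when non-empty
        total = sum(rates)
        t += rng.expovariate(total)          # time to next event
        u = rng.random() * total             # pick event proportional to rate
        if u < rates[0]:                     # packet arrival
            arrived += 1
            if queue < buffer_size:
                queue += 1
            else:
                dropped += 1                 # buffer overflow: packet lost
        elif u < rates[0] + rates[1]:        # modulating chain changes state
            phase = 1 - phase
        else:                                # service completion
            queue -= 1
    return dropped / arrived

print(simulate_mmpp_queue())
```

Setting the two arrival rates far apart with slow phase switching, as above, is what produces the strong autocorrelation in the arrival stream that the MMPP model is chosen to capture.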